Learning to Transform and Select Elementary Trees for Improved Syntax-based Machine Translations

Authors

  • Bing Zhao
  • Young-Suk Lee
  • Xiaoqiang Luo
  • Liu Li
Abstract

We propose a novel technique for learning how to transform source parse trees to improve the translation quality of syntax-based translation models that use synchronous context-free grammars. We transform the phrasal structure of the source tree into a set of simpler structures, expose these decisions to the decoding process, and find the least expensive transformation operation to better model word reordering. In particular, we integrate synchronous binarization, verb regrouping, and removal of redundant parse nodes, and incorporate a few important features such as translation boundaries. We learn the structural preferences from the data in a generative framework. The syntax-based translation system integrating the proposed techniques outperforms the best Arabic-English unconstrained system in the NIST08 evaluations by 1.3 absolute BLEU, which is statistically significant.
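As a rough illustration of two of the tree transformations named in the abstract, the sketch below collapses redundant unary nodes and then binarizes a toy parse tree. It is not the authors' method: the `Node` class, the function names, and the `*` suffix for intermediate labels are all hypothetical, and plain left binarization stands in for the synchronous binarization of the paper, which would also depend on word alignments with the target side.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A parse-tree node; leaves (pre-terminals) carry a word and no children."""
    label: str
    children: List["Node"] = field(default_factory=list)
    word: str = ""


def collapse_unary(node: Node) -> Node:
    """Drop redundant unary chains (e.g. NP -> NP -> ...), keeping the top label.

    Assumption: a node is 'redundant' when its only child is itself a
    non-leaf node. Pre-terminals such as (VP (VBD slept)) are left alone.
    """
    while len(node.children) == 1 and node.children[0].children:
        node = Node(node.label, node.children[0].children, node.word)
    return Node(node.label, [collapse_unary(c) for c in node.children], node.word)


def left_binarize(node: Node) -> Node:
    """Left-binarize so that no node has more than two children.

    Intermediate nodes get the parent label plus '*' (a hypothetical convention).
    """
    kids = [left_binarize(c) for c in node.children]
    while len(kids) > 2:
        merged = Node(node.label + "*", kids[:2])
        kids = [merged] + kids[2:]
    return Node(node.label, kids, node.word)


def to_bracket(node: Node) -> str:
    """Render a tree in bracketed (Penn Treebank style) notation."""
    if not node.children:
        return f"({node.label} {node.word})"
    return f"({node.label} " + " ".join(to_bracket(c) for c in node.children) + ")"


# Toy source tree with a redundant NP -> NP chain and a ternary NP.
tree = Node("S", [
    Node("NP", [Node("NP", [
        Node("DT", word="the"),
        Node("JJ", word="old"),
        Node("NN", word="man"),
    ])]),
    Node("VP", [Node("VBD", word="slept")]),
])

simplified = left_binarize(collapse_unary(tree))
print(to_bracket(simplified))
```

In the paper these choices are exposed to the decoder and scored, whereas this sketch applies them deterministically; a closer approximation would generate several alternative trees and let a model select among them.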

Similar resources

Syntax Augmented Machine Translation via Chart Parsing with Integrated Language Modeling

We present a hierarchical phrase-based translation model which annotates and generalizes existing phrase translations with syntactic categories derived from parsing the target side of a parallel corpus. We associate target parse trees for each training sentence pair with a search lattice constructed from the existing phrase translations on the corresponding source sentence, and consider techniq...

Detecting and Correcting Syntactic Errors in Machine Translation Using Feature-Based Lexicalized Tree Adjoining Grammars

Statistical machine translation has made tremendous progress over the past ten years. The output of even the best systems, however, is often ungrammatical because of the lack of sufficient linguistic knowledge. Even when systems incorporate syntax in the translation process, syntactic errors still result. To address this issue, we present a novel approach for detecting and correcting ungrammati...

Embedded Adaptive Machine Translation Environments

In this paper we present a machine translation environment for the automatic translation from Japanese into German. An important point regarding its implementation is that it is completely embedded in the widely used spreadsheet program Excel to ensure its easy use by any potential user. The complete lexical data as well as all transfer rules are clearly arranged on worksheets to make it possib...

Statistical Translation Model Based On Source Syntax Structure

Syntax-based statistical translation models have been shown to outperform phrase-based models, especially for language pairs with very different syntactic structures, such as Chinese and English. In this talk I will introduce a series of statistical translation models based on source syntax structure. The tree-based model uses the one-best syntax tree for translation. The forest-based model uses a comp...

Publication date: 2011